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Abstract: Multi-objective evolutionary algorithms (MOEFAs) are typically used to optimize two or three 
objectives in the accelerator field and perform well. However, the performance of these algorithms may 
severely deteriorate when the optimization objectives for an accelerator are equal to or greater than four. 
Recently, many-objective evolutionary algorithms (MaOEAs) that can solve problems with four or more 
optimization objectives have received extensive attention. In this study, two diffraction-limited storage 
ring (DLSR) lattices of the ESRF-EBS type with different energies were designed and optimized using 
three MaOEAs and a widely used MOEA. The initial population was found to have a significant impact 
on the performance of the algorithms and was carefully studied. The performances of the four algorithms 
were compared, and the results demonstrated that the grid-based evolutionary algorithm (GrEA) had the 
best performance. MaOEAs were applied in many-objective optimization of DLSR lattices for the first 
time, and lattices with natural emittances of 116 pm-rad and 23 pm-rad were obtained at energies of 2 
GeV and 6 GeV, respectively, both with reasonable dynamic aperture and local momentum aperture 
(LMA). This work provides a valuable reference for future multi-objective optimization of DLSRs. 
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1. Introduction 


Fourth-generation synchrotron radiation light sources, also known as DLSRs [|l], have been 
constructed worldwide. Their natural emittance is approximately one to two orders of magnitude lower 
than that of third-generation light sources, generally of the order of tens or hundreds of picometers, 
which is close to the X-ray diffraction limit. DLSRs use a multibend achromat (MBA) structure and 
strong focusing magnets, which reduce emittance significantly and result in high natural chromaticity. 
High-strength sextupoles can correct chromaticity well and introduce a large nonlinear force, which can 
lead to the deterioration of the lifetime and dynamic aperture [2]. Optimizing an accelerator requires 


fulfilling many objectives, such as achieving low emittance, good nonlinear performance, and excellent 
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performance. Optimizing them simultaneously is an enormous challenge [3]. 


The optimization of a DSLR lattice is a nonlinear and intricate problem. Previously, the global scan of 
all stable settings (GLASS) [4] was used to explore the global linear properties of storage rings and 
provide a systematic method for finding stable optics. Compared to third-generation light sources, the 
number of magnets in DLSRs is remarkably higher, and GLASS consumes a significant amount of time. 
Over the last two decades, MOEAs have been widely used in accelerator optimization [5-9] to determine 
a set of reasonable solutions for each optimization objective. Among these algorithms, multi-objective 
particle swarm optimization (MOPSO) [10] and the multi-objective genetic algorithm (MOGA) [11] are 
the most widely used. Moreover, the High Energy Photon Source (HEPS) [12] study demonstrated that 
combining the MOGA and MOPSO in a rational way to optimize the lattice can yield better results than 


using only one algorithm [13]. 


Recent studies on MOEAs have indicated that classical evolutionary algorithms, such as NSGA-II [14] 
and SEPA2 [15], are in effective in dealing with many-objective optimization problems (MaOPs) if the 
number of objectives is four or more, owing to a decrease in the selection pressure on the Pareto front 
[16, 17]. When the dimensionality of the objective space increases to four or more, all solutions in a 
population become non-dominated solutions in previous generations | 18]. This degrades the performance 
of these algorithms, possibly even to the point of a random search. Because an increase in the number of 
objectives exponentially increases the number of non-dominated solutions in the population, the 
effectiveness of the crowding distance method [19] of NSGA-II and other selection mechanisms to 
ensure population diversity is reduced, causing the algorithms to converge prematurely and fall into local 


optima [20]. 


MaOEFAs have received considerable attention in recent years [21]. Various methods have been 
proposed to effectively enhance the performance of evolutionary algorithms in handling MaOPs. These 
approaches can be roughly classified into five types [22]: Pareto-dominance modification, indicator, 
decomposition, grid, and diversity-emphasis. Using these approaches, several MaOEAs can effectively 


solve MaOPs. 


In the accelerator field, the Shanghai soft X-ray free-electron laser (SXFEL) has applied NSGA-III 
[23], an MaOEA, to optimize the overcompression mode in the linac for producing large-bandwidth 
XFEL pulses [24]. Using decomposition, NSGA-II determines a well-distributed hyperplane and 
associates all solutions with the reference points in this hyperplane to enhance the selection pressure and 
maintain diversity for solving MaOPs. The SXFEL experiment demonstrates that NSGA-III significantly 


outperforms NSGA-II in the case of more than three objectives. However, the performance of NSGA-III 


varies for different test problems and is even worse than NSGA-II for some problems [25]. To further 
study the performance of MaOFAs in the accelerator field, additional algorithms must be developed and 


experiments must be conducted. 


The MOGA is not independent of the distribution of the initial population |13] and requires a good 
distribution of the initial population to converge to the true global optima [26]. A stable periodic solution 
for a lattice is called a stable solution, and a stable solution that satisfies these constraints is called a 
feasible solution. A DLSR uses many magnets, and finding a stable or feasible solution using a vast 
number of magnet combinations is difficult. In this study, the number of variables for the 2 GeV ring 
was 28, and each variable was changed once, producing 278 magnet combinations. On average, 100 
thousand randomly generated solutions produce only 10-15 stable solutions, of which only an 
extraordinarily small number are feasible. Assuming that the proportion of feasible solutions in the 
population in the early optimization stage is very small, these solutions and their feasible offspring will 
dominate all the remaining infeasible solutions. Eventually, all solutions in the population will be 
generated from these solutions, thus reducing the diversity of the population and leading algorithms to 


converge to the local optima. 


In this study, many-objective optimization of DLSR lattices is performed for the first time, and the 
performances of four different algorithms are compared in terms of many-objective optimization. Two 
DLSR lattices and strategies for their optimization are described in Section 2. The effects of the initial 
solutions obtained using three different methods on evolutionary algorithms are discussed in Section 3. 
The multi-objective optimization results with four different algorithms for the 2 GeV ring are presented 
and analyzed in Section 4. Section 5 presents the results of the multi-objective optimization of the 6 GeV 


ring. The conclusions are summarized in Section 6. 


2. Method 


2.1. Designed Lattices 


To evaluate the performance of different algorithms on the MaOPs of DLSR lattices, two lattices were 
designed with energies of 2 GeV and 6 GeV. The magnetic arrangement of one cell of the 2 GeV ring 
lattice is shown in Fig. 1 (top). The lattice was designed with a circumference of approximately 300 m 
and an H7BA focusing structure [27], which has 12 cells and provides 12 identical straight sections with 
a length longer than 5.3 m. In addition to the H7BA structure, the focusing cell comprehensively uses 
reverse bends (RBs) [28] and longitudinal gradient bends (LGBs) [29] with transverse focusing to reduce 


beam emittance as much as possible. The focusing structure uses the ESRF-EBS scheme [27], in which 


the space between two pairs of dipoles, the first and second and the sixth and seventh, is expanded to 
correct the natural beam chromaticity with weakened sextupoles because of the dispersion bump. The 
main magnets of each cell include seven combined transverse and longitudinal gradient dipoles, ten 
quadrupoles, four RBs, and six quadrupoles. As shown in Fig. 1 (bottom), a focusing quadrupole other 
than that of the 2 GeV ring is positioned next to the purple sextupole of the 6 GeV ring because storage 
rings with higher energy require a stronger focusing force. The lattice of the 6 GeV ring, which has 48 


cells, has an approximate circumference of 1350 m. 
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Fig. 1. Magnet layout of one cell each of 2 GeV and 6 GeV ring lattices. The vertical coordinate indicates the 
dipole field. For visualization convenience, the heights of the focusing quadrupole, defocusing quadrupole, 
and sextupoles are set to different fixed values, where red and purple indicate focusing and blue and green 


indicate defocusing. 


The 2 GeV and 6 GeV rings had 28 and 29 variables, respectively. For the 2 GeV ring, the maximum 
strength of the dipoles was 1.2 T; the maximum gradient of the independent quadrupoles was 60 T/m; 
and the maximum gradient of the combined magnets was 40 T/m. The corresponding maximum limits 
for the 6 GeV ring were 1.5, 80, and 60 T/m, respectively. As shown in Table 1, K1 is the gradient of the 
independent quadrupole and is combined in RBs; K2 is the gradient combined in dipoles; RBA is the 


angle of the RBs; LA is the longitudinal gradient distribution coefficient of each dipole, where the first 
dipole has the same distribution coefficient as the second; assuming that the angles of the four different 
dipoles in a cell are A, B, C, and D, MB is the angular distribution coefficient, which for A, B, C, and D 
is denoted by a, b, c, and d, respectively, where the ratios a:b:c:d = A:B:C:D; LB is the length of each 
dipole; and LD is the deviation of distances between different dipoles, while these distances are set to 


fixed values at the beginning of 


Table 1. Variables of the two DLSR lattices and their ranges. 


Variables cei aaa Units 
2GeV 6GeV 2GeV 6 GeV 

Kl 7 8 <60 <80 T/m 

K2 4 <40 <60 T/m 

RBA 2 <0.045/0.090  <0.004/0.008 rad 

LA 3 <10 

MB 4 

LB 4 [0.5, 1.5] [0.6, 1.8] m 

LD 4 m 


2.2. Algorithms 


The benchmark algorithm used in this study was NSGA-II, and the other three MaOEAs were GrEA 
[30], NSGA-HI, and MOEA/D [31]. The flowcharts of these algorithms are shown in Fig. 2, and they are 


described below. 


NSGA-II is a non-dominated sorting algorithm based on crowding distance, and its basic framework is 
described below. First, for a given initial population Q with a fixed size N, the offspring population S 
with the same size is generated by selection, crossover, and mutation. Then, N non-dominated solutions 
are selected from the merged populations of Q and S according to non-dominated sorting [23] and 
crowding distance as the new population Q. These steps are repeated until the algorithm converges or 


reaches the maximum number of iterations. 


The GrEA is a grid-based algorithm used for MaOPs. Its basic algorithm framework is similar to that 
of NSGA-II. It introduces a grid mechanism to enhance the convergence and diversity of the algorithm. 
The non-dominated sorting method in NSGA-II is replaced by grid dominance and grid difference, 


which effectively strengthen the convergence pressure to determine the Pareto front. Additionally, the 


fitness assignment process is revised through three grid-based criteria: grid ranking (GR), grid crowded 
distance (GCD), and grid coordinate point distance (GCPD). 
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Fig. 2. Flowcharts of the four algorithms. NSGA-II, GrEA, and NSGA-III have a similar framework and 


are shown in a single flowchart (left). MOEA/D is shown on the right. 


NSGA-III is an improved version of NSGA-II, which solves MaOPs by modifying the crowding 
distance operator. All steps in NSGA-III other than the step involving the crowding distance operator are 
the same as those in NSGA-II. Instead of the crowding distance operator, NSGA-III uses the following 
steps: First, many well-distributed reference points are generated using Das and Dennis's systematic 
approach [32] to maintain diversity. Each population member is then associated with its nearest reference 
line after standardization. Finally, non-dominated solutions close to the reference line are given a higher 


priority. 


MOEA/D uses uniformly distributed weight vectors to decompose a multi/many-objective problem 
into a series of single-objective sub-problems and optimizes them simultaneously. Each subproblem is 
optimized and updated with information regarding its corresponding neighborhood, which is determined 
by the distance between the individual weight vectors. Several aggregation methods, such as the 
weighted sum (WS) [33], Tchebycheff [33], and penalty-based boundary intersection (PBI) [34], are 
used to aggregate the individual objective values in MOEA/D. 


Three metrics, SumMin, MinSum, and Range, were used to evaluate the performance of the MaOEA 


[35]. Assume MaOPs, where the optimal value of each objective is the minimum value, SumMin is the 
sum of the minimum values of each objective in the population, which characterizes the convergence to 
the Pareto front edge, and MinSum is the minimum value of the sum of the individual objectives in the 
population, which characterizes the convergence to the Pareto front center. Smaller values indicate better 
convergence. Range is the sum of the range of objective values for each objective in the population, 
which characterizes the diversity of the population. The larger the value, the better the diversity. For 


clarity, the values of the three metrics were divided by the number of objectives. 


The inverted generational distance (IGD) [36] and hypervolume (HV) [37] are widely used as metrics 
to compare the performance of various MaOEAs. However, the calculation of the IGD requires 
knowledge of the true Pareto front; therefore, it was not used in this study. HV measures the combined 


performance of convergence and diversity, with a larger HV indicating better performance. 


In this study, the HV was the main metric for evaluating the performance of the different algorithms, 


and the other three metrics were used as supplements. 


2.3. Strategy 


To obtain high performance and good stability, the natural emittance, dynamic aperture, LMA, and the 
brilliance at the central bending magnet and center of the straight section were selected as the 
optimization objectives in all multi-objective optimizations. Although emittance and brilliance are not 
independent objectives, they can be used simultaneously as optimization objectives. This is because 
brilliance is not only related to the emittance but is also closely related to the beam optics at the source 
point. A lattice with a larger emittance may have higher brilliance if the former has a smaller 
B function at the source point. Moreover, several light sources have been implemented or proposed to 
replace the central bending magnet with a superbend to obtain higher-energy photons, such as 
DIAMOND [38], ALS-U [39], SLS-II [40], Sirius [41], and WHPS [42]. Therefore, using the brilliance 
at the central bending magnet as an objective is reasonable. In the following experiments, all calculations 


and simulations were performed using AT [43]. 
The two important characteristics of a storage ring are brilliance and coherence, and a very effective 
way to improve them is to reduce its natural emittance. Therefore, the natural emittance €- , was 


used as an optimization objective. A lattice with an extremely small emittance is likely to be impractical 
because the dynamic aperture and LMA may be too small for beam injection and beam lifetime. The area 


of the dynamic aperture without energy deviation was used as the optimization objective and is denoted 


as DA. A sufficient LMA is typically required to obtain the desired beam life. The LMA determined by 
the transverse beam dynamics tends to be small at locations with large dispersion invariants, such as the 
arc of the H7BA cell. In fact, the LMA is usually limited by transverse dynamics only in the arc and by 
the RF bucket height elsewhere. It tends to be the largest at the center of the straight section and the 
smallest at the position with maximum dispersion. The LMA at these two positions was obtained by 
tracking 1000 turns. The average of the LMA at the above two positions was used as the LMA 
optimization objective and is denoted as 6 [26]. The light drawn from the bending magnet can have a 


higher energy, and its brilliance [44] is 
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O > „y represent the bunch size in the horizontal or vertical direction; © y . is the effective 


vertical divergence; 0 ' is the vertical beam divergence; and 0 | is the vertical opening angle 


y r 


of synchrotron radiation. In this study, the brilliance of the center dipole BriB was used as an objective, 


and it was measured when the frequency w was equal to the characteristic frequency w , the beam 


Cc 
current was 200 mA, and the coupling ratio, defined as the ratio of the vertical emittance to the 
horizontal emittance, was 10%. An insertion device is installed at the center of the straight section to 


generate synchrotron radiation. Its brilliance [44] is given by 
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O Tx ‘ty ‘are the root mean square deviations of the transverse position and angle in horizontal or 


vertical direction; O is horizontal beam divergence; A is undulator period and 


x n 


L ip is undulator length. Brilliance BriU was generated by an undulator with a length of 200 cm, 


period of 25 mm, and K value of 1.5 at the center of the straight section when the beam current was 200 
mA and the coupling ratio was 10%. By changing the sign of the variable, all objectives were between 


the minimum value and optimal value during the optimization process. The objectives are denoted as 


E , -DA, —6, —BriB, and —BriU in the text. 


X 


Bending magnets inevitably produce dispersion that separates the off-energy orbit from the ideal 
trajectory. The high-frequency accelerating cavity and insertion devices were mounted in a straight 
section, and energy deviations were introduced. To minimize the coupling of energy variations to the 
transverse oscillations, the dispersion of the straight section for the beam lines was matched to zero when 
the multi-objective optimization was performed. Before calculating the dynamic aperture and LMA, the 
horizontal and vertical chromaticity were corrected to +1 if the momentum compaction factor was 
greater than 0; otherwise, it was corrected to -1. In addition to the limitation of the variable due to the 


power supply, the following constraints were imposed: 


(1) The values of the natural emittance and tune should be real numbers to ensure that the 


solution is stable. 
(2) The maximum ß in the horizontal and vertical directions should not be more than 30 m. 


In the experiments described below, a simulated binary crossover and polynomial mutation were used 
in the genetic operator, and their settings were the same for each algorithm. For each optimization, the 
initial population size was 1000. In the various algorithms, the specific parameters were set as follows: 
The number of grids for GrEA was nine; the number of reference points in NSGA-III was the same as 
the population size; the aggregation function used for MOEA/D was the Tchebycheff function; and the 
neighborhood size was set to 1/10 of the population size, that is, 100. 


3. Optimization results with different initial populations 


The results of the initial multi-objective optimization of the 2 GeV ring showed that most of the 
solutions in the first 100 generations of each algorithm were infeasible and that the minimum value of 
the natural emittance of the final optimization result was greater than 250 pm-rad, which was much larger 
than the expected value. Almost no feasible or stable solutions existed for the initial population. Because 


this optimization involved 28 variables, the probability of randomly generating a stable solution within 


the range of the variables was almost zero, which led to a search for stable and feasible solutions in the 
early stages of optimization. After one or several feasible solutions were found, only those that 
dominated the other solutions were used to determine the search direction. This caused the solutions to 


become trapped in local optima. 


In this section, three approaches were used to generate the initial population. In the first approach, the 
initial feasible solution was generated by the transport line design, as follows: One cell of the storage 
ring was used as the transport line, and the initial beam optical function was input at the initial point. As 


the initial point was the center of the straight section and the center of the straight section was the 


and B 


were chosen to be suitable small values, and n 


symmetry point, B j x -P : 
x 


x 
B ` ,andn ` were set to zero for the initial beam optical function. As the transport line did not 
y x 


consider a periodic solution, the question of whether a solution was stable did not arise. Then, each 
variable was changed one by one until the subsequent beam’s optical distribution met the following 
criteria: the beam optics inside the dipoles meet the low-emittance requirement; the distribution of the 
dispersion bump is suitable; and the output beam optical function is equal to or close to the initial beam 
optical function. After the first feasible solution was obtained, sufficient solutions were seeded into the 


initial population. 


In the second approach, a certain number of feasible solutions were found, and the initial population 
was generated by seeding them. Many solutions were randomly generated in the range of variables, and 
all solutions satisfying the constraints in Section 2 were feasible. However, only 63 feasible solutions 
were found in the 200 million randomly generated solutions, which took a total of 23 h 45 min with 200 
Xeon 2.4 GHz CPUs in the Supercomputing Center of Wuhan University cluster. The initial population 
was then seeded with these feasible solutions to save time and computing power. In the third approach, 
sufficiently stable solutions were searched for and selected directly as the initial population. Twenty 
million solutions were randomly generated within the appropriate ranges of the variables, and all stable 
solutions among them were found. A total of 2365 stable solutions were found in 2.4 h with the same 


CPUs described above, of which 1000 were randomly selected as the initial population. 


To investigate the effect of the initial population on the algorithms, natural emittance and dispersion 
were used as optimization objectives. NSGA-II was used to optimize them, and the maximum number of 
iterations was 300. As shown in Fig. 3, the results of the third approach at the 100 generation were 
much worse than those of the other two, and its solutions on the Pareto front were rare. The initial 
solutions generated by the third approach were randomly generated stable solutions, most of which did 


not satisfy the given constraints and led to a search for feasible solutions in early optimization and poor 


population diversity. For different generations, the first approach always had the best Pareto front, and 
the second approach was the second best. The first approach had the best performance because its initial 
solution was seeded with a solution with low-emittance characteristics and zero dispersion; however, it 


did not necessarily perform equally well in optimizations with more than three objectives. 
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Fig. 3. The Pareto front for two objectives, natural emittance and horizontal dispersion at the center of the 
straight section obtained with NSGA-II, for different generations and initial populations generated by three 


approaches. 


Table 2. Diversity and convergence metrics for the last generation corresponding to different 


approaches. 
SumMin MinSum Range HV 
Approach 1 0.2594 0.5386 0.7406 0.0494 
Approach 2 0 0.4036 0.8627 0.1322 


The first and second approaches produced different initial populations, which were then subjected 
to multi-objective optimization using NSGA-II with a maximum of 250 iterations. The algorithm 
converged in the last iteration. The data in Table 2 show the results after normalizing each objective 
value between zero and one. This normalization was also performed for all subsequent results. As shown 
in Table 2, SumMin in the second approach was 0, which indicates that the second approach yielded 
smaller values for each objective as well as for its MinSum. The Range and HV of the second approach, 
on the other hand, were larger. Figure 4 shows that the second approach had a smaller lower bound and 
broader distribution, which is consistent with the metric results above. We also studied the effect of 
different approaches on other algorithms and found that it was similar to the effect on the NSGA-IL. 


These results show that the performance of the second approach in multi-objective optimization is better 


than that of the first approach. 
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Fig. 4. Parallel coordinate plots of all solutions corresponding to two approaches for the last generation. The 
horizontal coordinates represent the individual objectives, the vertical coordinates represent the magnitudes of 


the normalized objective values, and each line represents a solution. 


Although the first approach had the best performance in two-objective optimization, it did not perform 
as well as the second approach in multi-objective optimization. The initial population generated by the 
first approach was seeded with only one feasible solution, such that the variable space of the population 
was around this initial solution and smaller than that of the second approach. The second approach 
produced a larger variable space for the initial population, which yielded better results in multi-objective 
optimization. When the number of variables is large, leading to a large variable space, none of the 
randomly generated stable solutions are likely to be feasible, or only a few of them are. This can cause a 
reduction in population diversity and trap the algorithm in local optima. To find global optimal solutions, 
sufficient time must be spent finding additional initial feasible solutions for significant variable space 
problems. Therefore, the second approach was used to generate the initial population for the subsequent 
experiments. Recently, machine learning techniques have been used as substitutes for simulations to 
evaluate solutions in optimization algorithms [45, 46]. If machine learning is used, then the 
computational resource consumption required for the second approach is likely to be significantly 
reduced. Furthermore, incorporating machine learning into the iterative process of evolutionary 
algorithms can lead to significant reductions in the required computational resources and potentially 


yield better results. 


4. Optimization results for the 2 GeV ring 


4.1. Results 


In this section, the results of the many-objective optimizations of the 2 GeV ring lattice using four 
different evolutionary algorithms are described. The maximum number of iterations was set to 250 
because all algorithms converge at the 250" iteration. Table 3 shows that the algorithms, in descending 
order of HV, were GrEA, NSGA-III, NSGA-II, and MOEA/D. Thus, GrEA showed the best 
combined performance in terms of convergence and diversity. Furthermore, the ascending order of 
SumMin and MinSum and the descending order of Range were the same as above, as shown in Table 
3. As shown in Fig. 5, GrEA had the smallest lower bound for all objectives and was the most 
widely distributed; NSGA-III was second; MOEA/D had a very high lower bound, and the solutions 
were distributed in a narrow region. This indicates that the GrEA exhibited the best performance, 


followed by NSGA-II, NSGA-II, and MOEA/D. 


Table 3. Diversity and convergence metrics of four algorithms for the last generation. 


SumMin MinSum Range HV 
NSGA-II 0.0803 0.4360 0.9149 0.0836 
GrEA 0.0046 0.4191 0.9648 0.1239 
NSGA-III 0.0291 0.4307 0.9410 0.0909 
MOEA/D 0.3585 0.5389 0.5007 0.0516 
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Fig. 5. Parallel coordinate plots of all solutions for four algorithms in the last generation. 


Figure 6 illustrates the projection of the Pareto fronts obtained by the four algorithms in the last 
generation. Overall, the performance of NSGA-III was close to but slightly worse than that of GrEA. 
When the emittance was less than 100 pm rad, the dynamic apertures of all the solutions were so 
small that the particles could not survive in the long term. When the emittance was in the preferred 
interval of 100—150 pm rad, where the solutions are more relevant for practical requirements, the 
maximum dynamic aperture, LMA, and BriB were obtained using GrEA. The BriU obtained using 
GrEA was close to that obtained using NSGA-III and larger than the corresponding values from the 
other two algorithms. When the emittance was between 250 and 350 pm rad, the dynamic aperture 
obtained using NSGA-II was significantly smaller than that obtained using GrEA and NSGA-III; 
however, the BriU was high. This suggests that NSGA-II did not maintain diversity in this 
emittance interval, resulting in a very small B-function at the center of the straight section for 
solutions with a small dynamic aperture and high BriU. A large dynamic aperture and high BniU in 
this emittance interval could be obtained when GrEA was used, which indicates that GrEA could 
better maintain diversity. The solutions of MOEA/D were clustered in a small region, which 
indicates their poor ability to maintain diversity, causing the algorithm output to fall into local 
optima. The results in the objective space show that GrEA performed the best, NSGA-III was 
slightly inferior, and MOEA/D performed the worst, which is consistent with the metrics used to 


evaluate the performance of the algorithms. 
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Fig. 6. Projection of the Pareto fronts obtained by four algorithms at the last generation. Black crosses 


represent two selected lattices. 


Because an increase in the number of objectives causes the number of non-dominated solutions to 
increase exponentially, all solutions in the population from approximately the 30" generation were non- 
dominated solutions. Choosing the best or preferred solution from among many solutions is a very 
difficult task. In this study, feature extraction was used to determine the preferred solution. Natural 
emittance, the smaller of the positive or negative maximum horizontal dynamic aperture, maximum 
vertical dynamic aperture, minimum LMA in the entire ring, as well as at the center of the straight 
section, and brilliance at the center of the straight section, as well as at the central bending magnet, were 
extracted as features and sorted in ascending order of natural emittance. Through feature extraction, the 
information about each solution can be observed directly, such that the desired solutions can be easily 
selected. With a sufficient dynamic aperture and LMA, solutions with smaller emittance and higher 
brilliance are preferred. Two competing solutions were selected based on the feature extraction of the last 
generation of the GrEA. The beam optics of these two lattices are shown in Fig. 7, whose natural 
emittances are 116.70 pm-rad and 124.00 pm-rad, respectively. Figure 8 shows that the horizontal 
dynamic aperture of the first lattice without energy deviation reached 7.5 mm and the vertical aperture 
reached 3.7 mm, whereas the corresponding values for the second lattice were 7.8 and 6.8 mm, 


respectively. Both lattices had a certain dynamic aperture for particles with a + 2% energy deviation. As 
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shown in Fig. 9, the LMAs of both lattices at the center of the straight section were sufficiently large, 


and the minimum LMAs in one cell are both larger than 2%. 
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Fig. 7. Beam optics of two selected 2 GeV ring lattices, the first with a natural emittance of 116.70 pm-rad 
and circumference of 283.71 m, and the other with a natural emittance of 124.00 pm-rad and circumference of 


284.87 m. 


5l —e—ô=0 || —e— §=0 
—0— 5=2% 
6 —=— 5=-2%| 
€ 
E4 
> 
2 
0 i i 
-5 0 5 
x (mm) x (mm) 


Fig. 8. Dynamic apertures of two selected 2 GeV ring lattices with different energy deviations at the center of 


the straight section. 


Fig. 9. Local momentum apertures within one cell of the two selected 2 GeV ring lattices. 


4.2. Discussion 


Because diversity represents the size of the exploration space, the greater the diversity, the larger the 
exploration space and the higher the probability of finding better solutions, which leads to better 
convergence. MOEA/D divides the population into T neighborhoods, and solutions only crossover and 
update the neighborhoods to which they belong. If one solution performs better in a problem that has 
another solution, the other solution is replaced by this one. Because optimization of the 2 GeV ring is a 
problem with stricter constraints, only a small number of solutions in the initial population are feasible. 
This means that these feasible solutions replace all infeasible solutions in the early evolutionary stage, 
which leads to a significant decrease in the number of solutions that are unique to the population. As 
shown in Fig. 10, the number of unique solutions in the population of MOEA/D was small in the early 
evolutionary stage. After approximately 50 generations, the number of unique solutions fluctuated 
between 400 and 600, whereas the unique solutions of the other three algorithms were close to 1000 for 
all generations. Therefore, the small number of unique solutions in MOEA/D leads to a significant 


decrease in diversity, which severely degrades its performance. 
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Fig. 10. Number of unique solutions in population with different generations of MOEA/D. 


NSGA-III uses the achievement scalar function (ASF) [47] to calculate the extreme points 
corresponding to each objective axis and then forms a hyperplane through these points, as shown on the 
left in Fig. 11. All the solutions were normalized by dividing the intercepts formed by this hyperplane by 
each objective axis. After normalization, the reference points uniformly distributed on the hyperplane 
intersecting each objective axis with an intercept of 1 were connected to the origin to form a series of 


reference lines. The process of generating reference lines in a two-dimensional objective space is shown 


on the right side of Fig. 11. NSGA-II balances diversity and convergence based on the relationship 


between solutions and reference lines. Therefore, the ability to find a suitable hyperplane directly 


determines the performance of NSGA-III. 
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Fig. 11. Schematic of the calculation of the intercepts formed by the hyperplane passing through the 
extreme points in a three-dimensional objective space (left), where Zi, max represents the extreme point 
corresponding to the i-th objective axis and Ij represents the intersection of the hyperplane formed by the 
extreme points with the i-th objective axis. Uniformly distributed reference points are used to form 


reference lines through the origin in a two-dimensional objective space (right) for a uniform division of 


objective space. 


As the optimization objectives for the DLSR lattices increase, the shape of the objective space 
becomes more complex. Consequently, the ASF encounters the problems of a negative intercept and an 
inability to determine a hyperplane, as mentioned by Deb et al. [48]. In this study, we used the maximum 
of the non-dominated front (MNDF) [48] instead of the ASF to determine the hyperplane. The MNDF 
was used to determine the hyperplane by considering the maximum value of each objective of the Pareto 

front as the intercept corresponding to each objective axis. Although MNDF was used as a substitute for 
ASF to determine the hyperplane, NSGA-III did not perform as well as GrEA, probably because the 
hyperplane found by MNDF did not allow NSGA-III to operate at full capacity. Identifying a hyperplane 
in a complex objective space that allows the mechanism of NSGA-III to function perfectly is difficult 
and requires further research. 
The GrEA divides the hyperspace into a specified number of grids. Solutions closest to the ideal 


point or closest to the endpoint of the grid closest to the ideal point are given higher priority to enhance 


the selection pressure. To increase diversity, solutions with fewer solutions in and around the grid are 
given higher priority. For example, as shown in Fig. 12, the objective space is divided into five segments 
in each dimension, with the ideal point located in the lower left corner of the square. All five solutions 
are non-dominated. For A and B, the grid coordinates are (0, 4) and (1, 4), and the distances of the grid 
from the ideal point (GR) are 4 and 5 (i.e., 0 + 4 = 4, 1 + 4 = 5). Because the GR of A is smaller than that 
of B, A is given a higher priority. Solutions B and C are in the same grid, and both have a GR of 5; 
however, B is given a higher priority because it is closer to the endpoint. To increase diversity, E is given 


a higher priority than A because more solutions exist in the grid near A. 


Fig. 12. Illustration of solutions in a two-dimensional objective space divided into a specified number of 


grids. 


Instead of finding a suitable hyperplane in a complex objective space, using uniformly distributed 
grids to achieve a good balance between convergence and diversity is highly effective and robust. 
However, this leads to a distribution of converged solutions of the GrEA near the vertices of the different 
grids near the ideal point. Specifically, the distribution of solutions in hyperspace is a series of 
independent blocks rather than a uniform surface. As shown in Figs. 6 and 13, the Pareto front consisted 


of many inhomogeneous blocks. The GrEA exhibited good parallelism and can be easily used for 


massive parallel computations. The computation time of the GrEA was almost negligible compared to 
that of the complex physical simulations. The parameter that had the greatest impact on the GrEA was 
the number of divisions in each dimension, which is recommended to be set to nine for an unknown 
problem or a slightly higher (or lower) value when a good approximation of the Pareto front for the 


problem is too difficult to achieve [30]. 


5. Optimization results for the 6 GeV ring 


Section 4 demonstrates that the GrEA exhibits outstanding performance. This section uses the GrEA to 
perform a multi-objective optimization of the 6 GeV ring. The second approach generated an initial 
population with a size of 1000, and the maximum number of iterations was 250. As shown in Fig. 13, the 
Pareto front started to change slightly and converged at the 200" generation. After convergence, the 


solutions were uniformly distributed in individual blocks, which agrees with the principle of GrEA. 
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Fig. 13. Projection of Pareto front for different generations. Black crosses represent the two selected lattices. 


As shown in Fig. 14, two representative lattices were chosen that had similar emittance, but the first 


lattice had a smaller value of the B function at the center of the straight section as well as at the 


y 
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central dipole, which gave it a higher brilliance at these two locations but with a smaller dynamic 
aperture. As shown in Fig. 15, the horizontal dynamic aperture of the first lattice reached 2.1 mm while 
its vertical aperture reached 2.1 mm, and the horizontal aperture of the second lattice reached 4.5 mm 
while its vertical aperture reached 4.4 mm. They all had a certain dynamic aperture for particles with a + 
2 energy deviation. Figure 16 shows that both lattices had good LMAs, and the minimum LMAs in one 


cell were larger than 2%. 
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Fig. 14. Beam optics of the selected 6 GeV ring lattices, the first with a natural emittance of 23.08 pm-rad and 
circumference of 1336.25 m and the other with a natural emittance of 25.03 pm rad and circumference of 


1327.21 m. 
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Fig. 15. Dynamic apertures of the two selected 6 GeV ring lattices with different energy deviations at the 


center of the straight section. 


Fig. 16. Local momentum apertures within one cell of the two selected 6 GeV ring lattices. 


6. Conclusion 


Two ESRF-EBS-type lattices with different energies were designed for the experiments. The 
distribution of the initial population affected algorithm performance. For MaOPs with strict constraints 
and no preferences, increasing the number of randomly generated feasible solutions can enhance the 
diversity of the population and potentially improve the algorithm’s performance. Three different 
MaOEAs and NSGA-II, a widely used MOEA, were used to perform multi-objective optimization of the 
2 GeV ring. GrEA performed the best because of its ability to efficiently and robustly solve MaOPs. 
Owing to the complex shape of the objective space, the ASF used by NSGA-III to find a hyperplane 
failed in this optimization. After using MNDF as a substitute for ASF, NSGA-III performed better than 
NSGA-II but slightly worse than GrEA. MOEA/D could not handle the constraints well, showing an 
inability to maintain diversity and falling into local optima. Moreover, its performance was even worse 
than that of NSGA-II. Representative lattices with good attributes were selected from the GrEA results 
via feature extraction. GrEA was used to perform the many-objective optimization of the 6 GeV ring, 
and its application was demonstrated on two lattices with different beam optics at the center of the 


straight section and the central dipole. 


This study uses MaOEFAs for the first time to perform multi-objective optimization of DLSR lattices 
and illustrates that the GrEA is robust and performs well. Solutions with promising properties to satisfy 
all the objectives can be obtained for lattices with different energies. Therefore, this optimization strategy 
can be easily applied to other DLSRs prepared for construction or to other aspects of accelerators with 


more than three objectives. 
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